Journals
  Publication Years
  Keywords
Search within results Open Search
Please wait a minute...
For Selected: Toggle Thumbnails
Differential privacy clustering algorithm in horizontal federated learning
Xueran XU, Geng YANG, Yuxian HUANG
Journal of Computer Applications    2024, 44 (1): 217-222.   DOI: 10.11772/j.issn.1001-9081.2023010019
Abstract450)   HTML7)    PDF (1418KB)(225)       Save

Clustering analysis can uncover hidden interconnections between data and segment the data according to multiple indicators, which can facilitate personalized and refined operations. However, data fragmentation and isolation caused by data islands seriously affects the effectiveness of cluster analysis applications. To solve data island problem and protect data privacy, an Equivalent Local differential privacy Federated K-means (ELFedKmeans) algorithm was proposed. A grid-based initial cluster center selection method and a privacy budget allocation scheme were designed for the horizontal federation learning model. To generate same random noise with lower communication cost, all organizations jointly negotiated random seeds, protecting local data privacy. The ELFedKmeans algorithm was demonstrated satisfying differential privacy protection through theoretical analysis, and it was also compared with Local Differential Privacy distributed K-means (LDPKmeans) algorithm and Hybrid Privacy K-means (HPKmeans) algorithm on different datasets. Experimental results show that all three algorithms increase F-measure and decrease SSE (Sum of Squares due to Error) gradually as privacy budget increases. As a whole, the F-measure values of ELFedKmeans algorithm was 1.794 5% to 57.066 3% and 21.245 2% to 132.048 8% higher than those of LDPKmeans and HPKmeans algorithms respectively; the Log(SSE) values of ELFedKmeans algorithm were 1.204 2% to 12.894 6% and 5.617 5% to 27.575 2% less than those of LDPKmeans and HPKmeans algorithms respectively. With the same privacy budget, ELFedKmeans algorithm outperforms the comparison algorithms in terms of clustering quality and utility metric.

Table and Figures | Reference | Related Articles | Metrics